Foreword


The Committee on Human Factors was established in October 
1980 by the Commission on Behavioral and Social Sciences and 
Education of the National Research Council. The committee is 
sponsored by the Office of Naval Research, the Army Research 
Institute for the Behavioral and Social Sciences, the National 
Aeronautics and Space Administration, and the National Science 
Foundation. The principal objectives of the committee are to 
provide new perspectives on theoretical and methodological issues, 
to identify basic research needed to expand and strengthen the 
scientific basis of human factors, and to attract scientists both 
within and outside the field for interactive communication and to 
perform needed research. The goal of the committee is to provide 
a solid foundation of research as a base on which effective human 
factors practices can build. 

Human factors issues arise in every domain in which humans 
interact with the products of a technological society. In order to 
perform its role effectively, the committee draws on experts from 
a wide range of scientific and engineering disciplines. Members of 
the committee include specialists in such fields as psychology, en- 
gineering, biomechanics, physiology, medicine, cognitive sciences, 
machine intelligence, computer sciences, sociology, education, and 
human factors engineering. Other disciplines are represented in 
the working groups, workshops, and symposia. Each of these dis- 
ciplines contributes to the basic data, theory and methods required 
to improve the scientific basis of human factors. 





Contents


Preface

Abstract

Introduction

Models of What, Held by Whom?

Types of Representations of Users’ Knowledge
    Simple Sequences
    Methods and Ways to Choose Among Them
    Mental Models
    Surrogates
    Metaphor Models
    Glass Box Models
    Network Representations of the System
    Comparisons

How Users’ Knowledge Affects Their Performance
    Chaos and Misconception in Both Novices and Experts
    Skilled Performance

Applying What We Know of the User’s Knowledge to Practical Problems
    Designing Interfaces
    User Training

Research Recommendations

References


Preface 


There has been a long-standing problem with inferring the 
causes of complex behavior. Mental events are not directly ob- 
servable; they must be inferred from overt behavior. Behaviorists 
reject mental events as legitimate scientific concepts. More re- 
cently, however, developments in cognitive science and artificial 
intelligence, in which mental events are specifically modeled and 
found to have measurable correlates in behavior, have brought the 
concepts back into fashion. These mental events, their description 
and postulated interrelationships, are the subject of this report. 
We focus specifically on the mental events that are postulated to 
occur as someone learns or performs complex tasks on computer 
software. 

From the point of view of cognitive science, users of computer 
software systems base their behavior on stored knowledge about 
particular sequences of actions, on general rules about how to 
accomplish certain tasks, or on a mental model (an underlying 
understanding of how the system works). Knowing what the user 
knows about or expects from a system has implications for both 
design and training purposes. From a design point of view, the 
system could be designed to fit the user’s goals in accomplishing 
tasks or could display enough of how it works to make accomplishing
a task easy to understand. From the training point of view,
users could be given instructions and exercises that clearly present
sequences, rules, and/or a model in order to make learning and
performing easy.

At present, there is no satisfactory way of describing what 
the user knows. There is no way to characterize the differences
among users of various systems as they go through the process
of developing an awareness and understanding of how the system
works or how a given task is to be performed. Consequently, the
Committee on Human Factors conducted a two-day workshop on 
May 15 and 16, 1984, to determine means for achieving a better 
understanding of what users know and its implications for system 
and software design as well as user training. This workshop was 
a continuation of the committee’s efforts to define research needs 
in the area of software human factors. Ten nationally known 
researchers on software design, cognitive psychology, and human 
factors met to discuss the issues having to do with what a user of 
software knows. 

As background for this workshop, John M. Carroll wrote an
invited paper entitled “Mental Models and Software Human Fac- 
tors: An Overview.” This was distributed to all participants in 
advance of the meeting. In turn, the workshop members prepared 
short two- to three-page position papers addressing additional top- 
ics and issues that they believed were important and warranted 
discussion at the workshop. Much of the discussion at the workshop
centered on sifting through the many definitions of the term
mental model, gathering ideas from among the variety of methods
used to represent users’ knowledge about software systems.

This report was prepared by merging the ideas generated by 
the workshop members with those in Carroll’s paper. It includes 
his central organization and literature review, adds more recent 
information, and clarifies the distinction between mental models 
and task representations. This report was then distributed to 
workshop participants for changes and additions. 

This report is written for the researcher concerned with the 
psychology of performance of complex tasks and for the prac- 
titioner who would like to use information about how the user 
thinks about both the task and the system in the design of com- 
puter software, its documentation, or training for its use. Most 
of the research on these questions has used software-based text- 
editing tasks as a domain and looked at the mental models people 
are purported to build of only simple devices. The results should
generalize to more complex tasks, such as process control,
tactical decision making, project planning, and graphics design,
but their scope has not been tested. The exclusion of these kinds
of tasks is not to be taken as an indication that the research reported
cannot cover these more complex tasks; establishing its scope is
an important research need.


Judith Reitman Olson 




Abstract 




Users of software systems acquire knowledge about the system
and how to use it through experience, training, and imitation.
Currently, there is a great deal of debate about exactly what users
know about software. This knowledge may include one or more of
the following:

• simple rules that prescribe a sequence of actions that apply
under certain conditions;

• general methods that fit certain general situations and
goals; and

• mental models, knowledge of the components of a system,
their interconnection, and the processes that change the
components; knowledge that forms the basis for users being
able to construct reasonable actions; and explanations
about why a set of actions is appropriate.

Discovering what users know and how these different forms
of knowledge fit together in learning and performance is important.
It applies to the problem of designing systems and training
programs so that the systems are easy to use and the learning is
efficient. Research on the effects of different representations on
ultimate performance is mixed. Research on exactly what users
know is scattered. Analytical methods and techniques for representing
what the user knows are sparse but growing.

This report reviews current work and through the review, 
identifies several important research needs: 

• Detail what kinds of mental representations people have of 
systems that allow them to behave appropriately in using 
the software. 

• Detail what a mental model would consist of and how a 
person would use it to decide what action to take next. 

• Produce evidence that people have and use mental models. 

• Determine the behaviors that would demonstrate a mental 
model’s form and the operations used on the model. 

• Explore alternative views of goal-directed representations 
(e.g., so-called sequence/method representations) and de- 
tail the behavior predicted from them. 

• Expand the types of mental representations that may exist 
to include those that may not be mechanistic, such as 
algebraic and visual systems. 

• Determine how people intermix different representations in 
producing behavior. 

• Explore how knowledge about systems is acquired. 

• Determine how individual differences have an impact on 
learning of and performance on systems. 

• Explore the design of training sequences for systems. 

• Provide systems designers with tools to help them develop
systems that evoke “good” representations in users.

• Expand the task domain of this research to include more 
complex software. 


Mental Models in 
Human-Computer Interaction: 
Research Issues About What the

INTRODUCTION

Discovering what the users of a computer software system do
know and should know are important goals in current research on 
human-computer interaction. Research on the kinds of knowledge 
people have when they use computers, including the concept of 
a mental model of the system, is one of the major topics that is 
bringing the field of human-computer interaction from the tra- 
dition of human factors closer to that of experimental/cognitive 
psychology. Traditional human factors work has focused principal 
attention on behavior and performance itself, and has avoided the 
problem of describing the conceptual causes and effects of that 
behavior. On the other hand, while academic cognitive psychol- 
ogy does concern itself with theoretical interpretations of mental 
processes, it has focused on narrowly restricted mental processes, 
such as particular aspects of learning, memory, problem solving, or 
planning, and has studied them in the context of highly controlled 
and contrived laboratory tasks. The study of knowledge representations
of users of computer-based systems affords an opportunity
to explore both the theoretical base of behavior as well as specific 
behaviors in tasks that involve many different cognitive processes 
in concert. 

Because a number of researchers are concerned with mental 
representations, and because this topic has an impact on cognitive 
psychology and software human factors, there is an emerging need 
to clarify the concepts underlying knowledge representation and 
mental models as they apply to human-computer interaction. We 
intend to fill this need by reviewing relevant current research
and presenting a preliminary framework of the kinds of mental
representations of procedures people might have.


MODELS OF WHAT, HELD BY WHOM? 

Several key distinctions need to be recognized in discussing 
mental representations and mental models in human-computer in- 
teraction. For example, various individuals are concerned with 
using or designing a piece of software, and they hold different 
conceptions of it. These individuals include the user, the software
engineer, the human factors analyst, and the cognitive psychologist.
Furthermore, there are different aspects of the system to be
known: the task, knowing what the goal is and in general what
subtasks need to be accomplished to achieve the goal; the system
interface, knowing how to accomplish the sequence of subtasks in
this system, given the data presentation and interaction languages
of this system; and the system architecture, knowing the way the
data are stored, the internal processes the interactions invoke, and 
in general how the system works. 

Confusion has surrounded the term mental model because 
different authors have referred to different owners of the models 
(the user, the software engineer, etc.) and are not clear as to what 
the model actually represents (the task, the architecture, etc.). 

For example, some researchers and human factors analysts 
acknowledge that it is important to know the way users themselves 
are built and work, what their memory limits are, their common 
strategies in problem solving, their individual differences, and so 
on, in order to build useful, usable software. A system that requires 
the user to remember a list of 100 codes that represent areas of the 
country or the types of transactions that are required (as in some 
airline or automobile reservation systems) is predictably difficult 
because our model of the user includes a long-term memory that 
is confused by similar meaningless items. These researchers have 
sometimes used the term mental model to refer to the model that 
they, as researchers, have of the user’s mental architecture. 

Similarly, software engineers have ideas about what the user 
wants to do and how the system itself is structured that dictate 
how they will program the system and how it will operate to serve 
the users’ needs. Engineers have mental models of their design. 

This highlights another distinction, that between descriptive 
and prescriptive representations. Researchers want to be able to 
analyze what the user currently knows so they can explain why 
he or she is having difficulty, which aspects are learned and which 
are confused, and so on. In this case, they are using a descriptive
model, one that tells us what the user knows. Designers, however, 
want to construct a model of what the user should know. This 
representation could be used to analyze, for example, whether a 
proposed system will be too difficult to learn or where the errors 
might be. And, in designing commands and screen presentations, 
designers would like to invoke a model in the user that fits the 
dialog; they would like to get the user to build a mental model 
of the system that fits what the users have to do to operate the
system. Descriptive models are those held by the researcher to 
approximate what the user does know; prescriptive models are 
those held by the designer to approximate what the user should 
know. 

The concern of this report, however, is the representation that 
the user has of how a computer system works. Furthermore, since 
a mental model may be only one way of describing the knowledge 
that a user has about a system, this report is broadened to include 
all of what a user knows about using a particular piece of software, 
including how to use it and how it works. 

What users know differs in several important dimensions. It 
differs according to the sophistication of the user. For example, a 
user who is a programmer might have a very different understand- 
ing of a piece of software than a person with no programming ex- 
perience. Also, multiple mental models or several representations 
at different levels of abstraction might coexist within the same 
individual. For example, a person who both designed and later 
used a system might develop two somewhat compartmentalized 
understandings of the system. Analogous distinctions arise if we 
consider different task environments. For example, the representa- 
tion elicited for routine skilled behavior might differ substantively 
from that elicited when a person tries to recover from an error or 
otherwise solve problems (e.g., Rasmussen, 1983).

Because understanding what the user knows has practical 
importance for designing software and its training, and because 
it has theoretical importance in understanding people as they 
generally perform complex cognitive tasks, this report considers 
only the representations the users have when using software — 
representations of the task being performed, the user-system in- 
terface, and the system architecture. 

TYPES OF REPRESENTATIONS OF 
USERS’ KNOWLEDGE 

There are three basic types of representations that have been 
formulated to characterize what a user of software knows. The 
most elementary is a simple sequence of overt actions that fit a 
particular situation. The second is a more complex and general 
characterization, the knowledge of methods. This kind of rep- 
resentation of the user’s behavior incorporates general goals, the 
subgoals associated with them, a set of methods that could be brought
to bear to accomplish the subgoals, and, finally, sequences of operators
for those methods. Both of these conceptualizations are
task-oriented in that they contain no theory of how the software or
system works or what the user’s actions do internally to produce
the results.

The third, the mental model,¹ is knowledge of how the system
works, what its components are, how they are related, what the 
internal processes are, and how they affect the components. It is 
this conceptualization that allows the user not only to construct 
actions for novel tasks but also to explain why a particular action 
produces the results it does. 


Simple Sequences 

Users often have no knowledge of the underlying system or 
even general rules for getting things done. Novices, in particular,
resort to a learning method that borders on rote memorization.
They learn sequences of actions that will get the system to do
common types of tasks. For example, in using the operating system
on the Michigan Terminal System to print the contents of a text
file with the laser printer, many users merely memorize the nearly
nonsense strings:

$RUN *textform scards = pc:fw.macros + file spunch = -x
‘run a program called “textform” with input from a master file
[...] the input file [...]’

$RUN *pagepr scards = -x par = onesided
‘run a program called “pagepr” with input from the temporary
file x so that the output is printed on only one side of each
page’

The parameter to be entered is the name of the
file following the “+” in the first “scards” designation. Similarly,
some word processors require the user to memorize short, common
command sequences to accomplish certain repetitive actions, such
as “<cntl> XME” to exit, and “<cntl> XLA” to enact the printing
sequence. A good clue as to how often users rely on these simple
sequences is to note the cheat sheets that they keep available when
they are using software, or the notes made and often stuck to the
side of the cathode-ray tube to remind the user of some commands
that are commonly used but difficult to remember.

    ¹ [...] a subset of the knowledge Rouse and Morris (1986) call mental
    models. We would include knowledge that [...] the user to [...]
    of the system and to its future behavior. We
    would not include descriptions of its purpose and form, information that
    seems shallow and unhelpful in a performance context.

Young (1983) described one way in which users think about 
a calculator, as simple sequences or sets of task-action pairs. A 
task includes something the user wishes to accomplish (e.g., an 
arithmetic calculation or formula evaluation), which is associated 
with an action, or what the user must do in order to accomplish 
the task (e.g., key presses on a calculator). This knowledge is 
in the form of paired associates, and like the sequences to print 
a file described above, it has simple slots that indicate the free 
parameters the user must designate to fit the current situation. 
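
A task-action mapping of this kind can be sketched as a lookup table whose entries are key-press templates with slots for the free parameters. The Python sketch below is illustrative only: the task names, the key sequences, and the `fill` helper are hypothetical and are not taken from Young (1983).

```python
# Hypothetical task-action pairs for a four-function calculator.
# Each task is paired with a fixed key sequence; {a} and {b} are the
# slots for the free parameters the user supplies.
TASK_ACTIONS = {
    "add": "{a} + {b} =",
    "multiply": "{a} * {b} =",
}


def fill(task, **slots):
    """Retrieve the action paired with a task and fill in its slots."""
    template = TASK_ACTIONS[task]  # rote retrieval; no model of the device
    return template.format(**slots).split()


print(fill("add", a=12, b=7))
```

A task with no stored pair raises `KeyError`, which mirrors the limitation discussed here: knowledge held only as paired associates offers no way to construct an action for a task that was never memorized.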

A second description of simple sequences of actions is the 
keystroke model (Card et al., 1980a, b, 1983; Embley et al., 1978). 
The analyses in the keystroke models contain notations that de- 
scribe what sequences of actions users make in invoking simple 
commands: the keystrokes, mouse movements, and so on. In the Card
et al. (1980a, b, 1983) keystroke analysis, the analyst assumes that
the user needs time to make each act in producing the command:
a time to make a keystroke, a time to point with a mouse, a time
to move the hands from the keyboard to the mouse or back, and a
time to mentally prepare each command and its parameters. The
analysis assumes that users must retrieve each command sequence
from their memory, incurring a pause for mental preparation, and 
then execute the components of the command, pausing for addi- 
tional mental preparation times before each command word, each 
parameter, and each delimiter (such as pressing a parenthesis, 
return, or other type of operator). For example, a command se- 
quence for using a line-oriented editor to search a file for an error 
and fix it: 

a /f “errorstring”
‘search the whole file for an error’

a 16 “oldstring” “newstring”
‘alter line 16 so that the old string is replaced with the new string’

would include mental preparations before each line and before each
parameter, such as “/f” and “16,” and the strings to be searched
for and replaced. Analysis proceeds by attaching a constant time 
for each keystroke, movement, or mental preparation, affording 
a prediction of how long the formulation and execution of each 
command would normally require. 
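
The arithmetic of such an analysis can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the operator times are values commonly cited for the keystroke model and are not given in this report, the segmentation of the command into parts is hypothetical, and separating spaces are ignored for simplicity.

```python
# Operator times in seconds. Assumed values, commonly cited for the
# keystroke model; this report itself gives no numbers.
OPERATOR_TIME = {
    "K": 0.28,  # press one key
    "P": 1.10,  # point with a mouse
    "H": 0.40,  # move the hands between keyboard and mouse
    "M": 1.35,  # mentally prepare
}


def predict_time(operators):
    """Sum a constant time for each operator in the sequence."""
    return sum(OPERATOR_TIME[op] for op in operators)


def encode_command(parts):
    """Encode typed parts as operators: one M before each part,
    one K per character, and a final K for RETURN."""
    ops = []
    for part in parts:
        ops.append("M")
        ops.extend("K" * len(part))
    ops.append("K")  # RETURN
    return ops


# The search command from the text, split into command word,
# parameter, and string (a hypothetical segmentation).
ops = encode_command(["a", "/f", '"errorstring"'])
print(ops.count("M"), ops.count("K"), round(predict_time(ops), 2))
```

Because every operator contributes a constant, two candidate command designs can be compared simply by encoding each and comparing the sums.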

In the same spirit, Reisner (1984) assumes that the user needs 
a fixed amount of time to make each individual act in producing 
a command. Instead of one mental preparation time, however, 
Reisner (1984) posits specific mental acts (e.g., retrieving from 
long-term memory, calculating a number, copying a number), each 
of which takes a different length of time. The analyst assumes (or
knows from prior experimentation) how the various parameters 
are related (e.g., the time to calculate a number will be greater 
than the time to copy that number from a display) without spec- 
ifying each time exactly. Simple algebra is then used to predict 
which of various whole design alternatives, or which of various user 
methods, will require the shortest time to perform. 

These analyses of simple sequences facilitate both the comparison
of existing software packages, to find the one that will require
the shortest time to perform tasks, and the design and development
of new system languages.


Methods and Ways to Choose Among Them 

Users not only elicit simple sequences to fit simple situations 
by rote; they sometimes also choose among various possible general 
methods that fit a particular situation.³

A number of investigators have studied the organization of 
more general actions as a function of task goals in the domain 
of programming. A general finding is that skilled programmers 
recognize aspects of particular situations and select general ac- 
tions appropriate to them. For example, individual statements 
or sets of lines of code in a program are “chunked” into higher- 
order task-relevant structures. Skilled programmers can recall at a 
glance more lines of code than novice programmers (Adelson, 1981; 
McKeithen et al., 1981; Shneiderman, 1980). This is consistent
with prior studies of expertise and the organization of memory
(Chase and Simon, 1973; Egan and Schwartz, 1979; Reitman,
1976). These studies suggest that in the skilled programmer’s
knowledge base there is a mapping between chunks of actions or
methods (that often go together) and general task features, so that
the actions will be recalled and used at appropriate times in the
future. These chunks reflect a developed, deeper understanding
of routine programs, which are useful to a programmer writing
programs. Similarly, Ehrlich and Soloway (1984) have shown that
skilled programmers tend to employ patterns of actions, called
plans, consisting of routinely occurring sequences of programming
statements.

    ³ These methods are similar to the procedures remembered and used
    in the stage of “deciding and testing actions” in supervisory control tasks,
    described by Sheridan et al. (1986).

Furthermore, by examining the structure of recall protocols, 
McKeithen et al. (1981) determined that skilled programmers or- 
ganize their vocabulary of programming statements more stereo- 
typically than do novice programmers. It appears that with ex- 
pertise, the users’ understanding converges to a similar set of 
representations of concepts in the programming language. Data 
base designers reveal mental organizations that become increas- 
ingly homogeneous with greater expertise (Smelcer, 1986). 

A more complete theory about what the user knows about 
how to accomplish a particular task is the GOMS model (Card et 
al., 1983). GOMS is an acronym that stands for the elements of 
what the user knows: the goals, the operators, the methods, and 
selection rules. In the GOMS model, the user has a certain goal 
to accomplish (such as editing a manuscript that has been marked 
up). The user recognizes that this large goal can be broken into 
a set of subgoals (such as finding each editing mark and making 
the requisite changes). Subgoals are broken down into smaller and 
smaller subgoals until they match a basic set of methods, that is, 
sequences of operations that satisfy a small subgoal. 

The GOMS model states that users have some rules by which 
they choose the method that will fit the current situation. For 
example, users may know that there are several methods that can 
be used to find the first place in the manuscript to be edited: using 
the search function with a distinguishing string to be found, using 
the page-forward key until the target page is found visually, or
using the cursor key to find the specific target location visually.
People will choose whether to use the search, page-forward, or
cursor key method depending on how far away the next editing
target is assumed to be. Each of these methods is made up of
certain operators, key presses, and hand motions, as specified in
the keystroke model described above in the discussion of simple 
sequences. 
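
A selection rule of this kind can be written as a simple conditional. The Python sketch below is illustrative only: the function name, the screen size, and the distance thresholds are hypothetical and are not values reported by Card et al. (1983).

```python
def select_method(lines_to_target, lines_per_screen=20):
    """Choose a locate method from the assumed distance to the target.

    The thresholds are hypothetical selection rules, not measured ones.
    """
    if lines_to_target <= lines_per_screen:
        return "cursor-key"    # target is on or near the current screen
    if lines_to_target <= 3 * lines_per_screen:
        return "page-forward"  # target is a few screens away
    return "search"            # far away: search for a distinguishing string


for distance in (5, 50, 500):
    print(distance, select_method(distance))
```

Each method name returned here would, in a full analysis, stand for its own sequence of operators whose times can be summed as in the keystroke model.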

A number of empirical studies have shown that the predictions 
of GOMS and the keystroke model are reasonably accurate, and 
that sometimes one can even use the same time parameters across 
applications. Card et al. (1983) showed that their parameters for 
keystrokes and mental processing time were similar across text 
processors, operating systems, and graphics packages. Olson and 
Nilsen (1987) extended the analysis to show that the basic param- 
eters applied well to spreadsheet software. However, additional 
time parameters were required. One was to account for the time 
it took users to scan the screen (for example, to find on the screen 
the coordinates of a particular value in a spreadsheet). A second 
time parameter was required to account for the time it takes the 
user to choose between methods: the more methods to choose 
from, the longer the pause before executing a simple sequence in 
a command. 

Command grammars use a different analytic representation
but analyze the same kinds of mental events. The command
language grammar (CLG) (Moran, 1981) and Backus normal form 
(BNF) (Reisner, 1981, 1984) have been used to describe the orga- 
nization of sequences of actions that fulfill goals. These grammars 
are sets of rules that show the different ways in which an “alphabet”
of actions can be formed to produce acceptable “sentences”
that are understandable to a system or a device. 

For example, Reisner (1981, 1984) treats user actions that 
are acceptable to the system as a language. She describes the 
structure of this language as a BNF grammar. Figure 1 shows a 
sample of what in this formalism are called rewrite rules. At the 
higher levels are the user’s task goals and the possible methods that 
can achieve the goal. This is presumably a representation of the 
components of plans the user has ready to evoke to fill an overall 
task goal. Below these are the varieties of action sequences that 
can be elicited in a method. The top several lines of Figure 1 are 
similar to the goals/subgoals and methods of the GOMS analysis; 
the lower levels are similar to the keystroke model sequences. 
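
Rewrite rules of this kind can be sketched as a small data structure together with a function that lists every acceptable “sentence.” The Python sketch below uses the locate-command rules from Reisner's example; the dictionary layout, the `expand` function, and the `<number>` placeholder are assumptions made for illustration, not part of her formalism.

```python
from itertools import product

# Rewrite rules: each left-hand term maps to a list of alternatives,
# and each alternative is a sequence of symbols executed in order.
RULES = {
    "type locate command": [["type locate keyword", "type line number"]],
    "type locate keyword": [["L", "O", "C"], ["L"],
                            ["L", "O", "C", "A", "T", "E"]],
    "type line number": [["<number>"]],  # assumed terminal placeholder
}


def expand(symbol):
    """Return every terminal sequence the symbol can be rewritten to."""
    if symbol not in RULES:  # terminal: an action the user performs
        return [[symbol]]
    sentences = []
    for alternative in RULES[symbol]:
        # Expand each symbol, then combine one choice per position.
        choices = [expand(s) for s in alternative]
        for combo in product(*choices):
            sentences.append([t for part in combo for t in part])
    return sentences


for sentence in expand("type locate command"):
    print(" ".join(sentence))
```

Three rules are enough to generate all three acceptable ways of typing the command, which illustrates how compactly a grammar can state alternatives.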

Compared to GOMS, this representation more compactly
shows the alternative ways to accomplish a task or to enact a
series of keystrokes; GOMS requires a new method for each
alternative. While various methods (represented as sentences from
such a grammar) can be compared to see which takes less time, a
grammatical representation is less adequate than GOMS in that
it lacks any way to represent how a user selects the method
appropriate for the current situation.

Use Dn                              -> Identify first line + enter Dn command + press ENTER
Identify first line                 -> Get first line on screen + move cursor to first line
Get first line on screen            -> [right-hand side not legible in source]
“Locate” strategy                   -> Move cursor to command input field + type “locate” command + press ENTER
Move cursor to command input field  -> Use cursor keys | press PFCURSOR | null
Type locate command                 -> Type “locate” keyword + type line number
Type locate keyword                 -> L+O+C | L | L+O+C+A+T+E
Type line number                    -> Type number

FIGURE 1 A command grammar representation of actions necessary to
edit a line using a word processor. Rewrite rules applied to this domain are
compact definitions of the many acceptable ways to get something done in a
particular command language. One reads these rules from left to right; the
left-hand terms are made up of the elements listed on the right-hand side.
Elements connected by a “+” are executed in sequence; elements connected
by a “|” represent alternative ways of invoking the same goal. For example,
“Use Dn” consists of identifying the first line, then entering the “Dn”
command, and then pressing ENTER. Typing the locate keyword, however,
includes typing “LOC,” “L,” or “LOCATE.” Source: Reisner (1984:53).

The language format of grammars, however, allows the use of
standard sentence complexity measures to predict some aspects
of user behavior: the more rules, the longer it takes a user to
learn; the greater the sentence (sequence) complexity, the longer
the pauses between keystrokes; the more terminal symbols in the
language, the harder the language is to learn. These predictions
have not been fully tested, and there is some suggestion in the
literature about language understanding that these measures do
not adequately predict how difficult it is to understand sentences
(Fodor et al., 1974; Miller, 1962). The formalism, however, allows a
number of intriguing predictive possibilities for understanding and
recalling command languages. See Reisner (1983) for a discussion
of the potential value of such grammars.

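The counting measures mentioned above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the dictionary encoding, the function names, and the choice to count each alternative as one rule are ours, not Reisner's, and only the legible rules of Figure 1 are transcribed.

```python
from itertools import product

# Rewrite rules transcribed from the legible parts of Figure 1 (assumed encoding).
# Each nonterminal maps to a list of alternatives; each alternative is a
# sequence of symbols executed in order. An empty list stands for "null".
GRAMMAR = {
    "use Dn": [["identify first line", "enter Dn command", "press ENTER"]],
    "identify first line": [["get first line on screen", "move cursor to first line"]],
    "type locate command": [["type locate keyword", "type line number"]],
    "type locate keyword": [["L", "O", "C"], ["L"], ["L", "O", "C", "A", "T", "E"]],
    "move cursor to command input field": [["use cursor keys"], ["press PFCURSOR"], []],
}


def count_rules(grammar):
    """Count alternatives summed over all nonterminals (one possible rule count)."""
    return sum(len(alternatives) for alternatives in grammar.values())


def terminals(grammar):
    """Symbols used on a right-hand side that have no rule of their own."""
    used = {sym for alts in grammar.values() for alt in alts for sym in alt}
    return used - set(grammar)


def expand(symbol, grammar):
    """Return every terminal sequence ("sentence") derivable from a symbol."""
    if symbol not in grammar:
        return [[symbol]]
    sentences = []
    for alternative in grammar[symbol]:
        parts = [expand(s, grammar) for s in alternative]
        for combo in product(*parts):
            sentences.append([token for part in combo for token in part])
    return sentences


if __name__ == "__main__":
    print("rules:", count_rules(GRAMMAR))
    print("terminal symbols:", len(terminals(GRAMMAR)))
    for sentence in expand("type locate command", GRAMMAR):
        print(len(sentence), "actions:", " + ".join(sentence))
```

Symbols whose rules are not legible in the figure are treated as terminals here, so the counts describe this fragment only and should not be read as measures of the full grammar.
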
Mental Models


In its most generic application, the term mental model could 
be applied to any set of mental events, but few, if any, would
claim such meaning for the word model. Somewhat narrower in
meaning, the term could be used for any thought process in which 
there are defined inputs and outputs to a believable process which 
operates on the inputs to produce outputs. In this sense, one 
could have a mental model of one’s own behavior (“If I do this, 
then that will happen”), another person’s behavior, the input- 
output characteristics of any software process run on a computer, 
or any information process mediated by people or machine. It 
could be a series of paired associates by which the user predicts, 
through a causal chain, outputs of a process given its inputs. 

Given these general possibilities for the term mental model, it 
is most commonly used to refer to a representation (in the head) 
of a physical system or software being run on a computer, with 
some plausible cascade of causal associations connecting the input 
to the output. Accordingly, the user’s mental model of a system is 
here defined as a rich and elaborate structure, reflecting the user’s 
understanding of what the system contains, how it works, and why 
it works that way. It can be conceived as knowledge about the 
system sufficient to permit the user to mentally try out actions 
before choosing one to execute. A key feature of a mental model 
is that it can be “run” with trial, exploratory inputs and observed 
for its resultant behavior (Sheridan et al., 1986). 

Mental models are used during learning (such as using an 
analogy to begin to understand how the system works), in problem 
solving (such as in trying to extricate oneself from an error or 
performing a novel task), and when the user is reflecting on or
attempting to rationalize or explain the system’s behavior.

Users are typically described as using a mechanistic model; 
that is, the user is assumed to have a conceptual “machine” whose 
simulated function matches the actual target machine in some 
way.³ Three general kinds of models are called surrogates (Young,
1983), metaphors (Carroll and Thomas, 1982), and glass boxes
(DuBoulay et al., 1981). A fourth kind of model, the network
model, is a composite, blending the features of surrogates and
glass boxes.

³ This may be due more to the fact that researchers are good at
describing mechanistic models than to the fact that it is the only kind
of model people have. In fact, exploration of other representations is an
important research need.


Surrogates 

A surrogate is a conceptual analysis that perfectly mimics the 
target system’s input/output behavior and that does not assume 
that the way in which output is produced in the surrogate is the 
same process as that in the target system. It is a system that 
behaves the same, but is not assumed to be isomorphic in its inter- 
nal workings. Thus, while the surrogate always provides the right 
answer (the one that the target system would have generated), it 
offers no means of illuminating the real underlying causal basis 
for the answer. It is a good, complete analogy that may allow the 
user to construct appropriate behavior in a novel situation, but it 
does not help the user explain why the system behaves the way
it does. 

Young (1983) noted that it is very difficult to construct an 
adequate surrogate, even for a fairly simple system like a hand- 
held calculator. This raises the question of whether people ever 
hold surrogates in their minds, even for simple devices. 


Metaphor Models 

A metaphor model is a direct comparison between the target 
system and some other system already known to the user. A com- 
mon example, referred to widely in the literature, is the metaphor 
that “a text editor is a typewriter.” Many investigators have 
observed that new users spontaneously refer to this typewriter 
metaphor during early learning about text processors (Bott, 1979; 
Carroll and Thomas, 1982; Douglas and Moran, 1983; Mack et al., 
1983). The explanations people offer for system behavior are often 
couched in the vocabulary of the metaphor. Furthermore, the ex- 
tent to which knowledge in the metaphor source domain matches 
the target domain correlates with performance. That is, the task- 
action pairs that fit both the metaphor source and the target 
system are easy to learn; those that do not are often learned last 
or remain constant sources of error. For example, learners have 
less trouble learning how to use character keys than the backspace 
and carriage return keys; the latter typically operate differently in 
text processors than they do in typewriters. 
was taught to the user (e.g., subjects were instructed to think of
input as a ticket window). Studies of prescriptive conceptual models
tell us something about what kinds of models are useful, and about 
models that people could generate. On the other hand, they can 
validate prescriptive models that help users of complex systems 
when it is hard for the user to deduce an adequate representation 
merely from experience. 


Network Representations of the System 

Network representations contain the states a system can be 
in and the actions the user can take that change the system to 
another state (Miller, 1985). One particular type of network rep- 
resentation, the generalized transition network (GTN), contains 
detailed descriptions of what the system does (Kieras and Polson,
1983). GTNs are state transition diagrams that represent the
visible states of the system (i.e., the display on the screen) as nodes,
and the actions the user can take at each state (the commands or 
menu choices) as arcs. The connected nodes and arcs form a net- 
work that shows the sequence of states that follow user actions at 
each point in the software interaction. GTNs and other network
diagrams are often used as tools in system development, to give 
the designer a picture to refer to in order to keep track of what 
can be done at every state in the transaction. Figure 2 illustrates 
a portion of one of these networks for the actions that can be 
taken when a user enters a system and loads the word processing 
application. 

Networks can also be used to describe what the user knows 
about the system (Olson, 1987). Olson (1987) suggests that GTNs 
be used to represent users’ knowledge of system states and
allowable actions; these can be compared to the GTN of the actual
target system to measure the user’s level of learning or
understanding. Examination of the parts of the real GTN that are
missing in the user’s representation could indicate areas in which
learning or remembering certain functions is difficult. 
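
The comparison of a user's GTN with the system's GTN can be sketched as a set difference over arcs. The Python below is a minimal illustration, assuming a GTN is stored as a mapping from state to the actions available there; the state names, action names, and the coverage measure are hypothetical and are not taken from Olson (1987) or from Figure 2.

```python
# A GTN sketched as: state -> {user action -> resulting state}.
# All state and action names below are invented for illustration.
SYSTEM_GTN = {
    "main menu": {"select word processing": "document list", "log off": "exit"},
    "document list": {"open document": "editing", "return": "main menu"},
    "editing": {"type text": "editing", "save": "editing", "quit": "document list"},
}

# What a hypothetical learner believes about the same system.
USER_GTN = {
    "main menu": {"select word processing": "document list"},
    "document list": {"open document": "editing"},
    "editing": {"type text": "editing", "quit": "main menu"},
}


def arcs(gtn):
    """Flatten a GTN into a set of (state, action, next_state) triples."""
    return {(s, a, t) for s, actions in gtn.items() for a, t in actions.items()}


def compare(system, user):
    """Report system arcs the user lacks and user arcs the system lacks."""
    system_arcs, user_arcs = arcs(system), arcs(user)
    return {
        "missing": system_arcs - user_arcs,        # not known, or known wrongly
        "misconceived": user_arcs - system_arcs,   # believed but not true
        "coverage": len(system_arcs & user_arcs) / len(system_arcs),
    }


if __name__ == "__main__":
    report = compare(SYSTEM_GTN, USER_GTN)
    print(f"coverage: {report['coverage']:.2f}")
    for arc in sorted(report["missing"]):
        print("missing:", arc)
    for arc in sorted(report["misconceived"]):
        print("misconceived:", arc)
```

An arc the user holds with the wrong destination appears in both lists, which is one simple way to separate what was never learned from what was learned incorrectly.
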

The GTN is like a surrogate representation in that it does
not give an underlying explanation about why the elements are
related in the way that they are nor how the internal system
components behave. Nor is there any indication of the purpose these
actions serve toward a user’s goal. It does, however, represent what
the user knows about how the system works in simple
stimulus-response terms. A GTN displays the simple response that can be
expected from the system given each action the user takes. And,
importantly to the user, knowledge of these actions and their
consequences can be useful when the user must solve problems, either
when an error has just occurred or when a novel goal has arisen
and the user needs to decide on an appropriate sequence of actions.

FIGURE 2 A generalized transition network (GTN) representation of part
of the task of editing a document. Circles represent states or tasks, arcs
represent the connections between states, and labels on the arcs represent
the actions the user takes. Source: Kieras and Polson (1983:104).


Comparisons 

It is useful to consider the relation between sequence/method 
representations and mental models. People undoubtedly have both 
kinds of knowledge when they use computing systems. But re- 
search on these two approaches is largely complementary in that 
the kinds of questions addressed about one kind of representa- 
tion have been different from those about the other. Briefly, the 
sequence/method representations are more analytic in that they
can predict behavior (except errors) in some detail. Although the 
sequence/method approach has not typically dealt with predicting 
user errors, attempts have been made to show how user learning 
takes place. The mental models approach, on the other hand, 
accounts for errors as well as accurate behavior in novel and stan- 
dard situations, but does not predict the details of behavior well 
nor how the models are learned. 

Sequence/method representations, because they are composed 
of goal-action pairs, by their very nature predict how knowledge 
is used. To date they have represented only how to accomplish 
routine tasks (in which all the goal-subgoal and subgoal-action 
relations have been worked out) but have little or nothing to 
say about how knowledge is used in nonroutine tasks, such as 
in recovering from an error or behaving in an entirely unfamiliar 
situation. They do not have much generality in their conditions. 
And, there is no posited mechanism for problem solving when a 
new situation fits several general condition-action pairs. Without 
this mechanism, these analyses cannot account for errors. 

Some attempt has been made to account for how sequence/method 
representations are learned. Lewis (1986) provides an account of 
how users might acquire goal-action knowledge after they watch 
another person use the system. Through several simple heuristics 
that link actions to probable causes, the user begins to build a 
reasonable set of rules. The acquisition of rules is detailed by 
Lewis (1986), but the further learning in fine-tuning those rules
is not covered. Kieras and Bovair (1986) do not explain original 
learning per se, but have shown that learning a new system is 
speeded up if the user is familiar with another system that has 
many of the same rules in common. Neither of these approaches 
addresses the continued learning that goes on as the user acquires 
or discovers new strategies for efficiency. 

Research on mental models, on the other hand, has not con- 
centrated on the details of how a user uses a mental model nor how 
it is acquired. Douglas and Moran (1983) have produced the most 
detailed analysis of the behavior of a user who has a mental model. 
They examined the analogy of “a text processor is a typewriter” 
by noting the typewriter condition-action sequences that matched 
and mismatched those in the new system. Those condition-action 
pairs that matched were learned easily and quickly, and those 
that did not match produced continued errors and pauses. Other 
researchers have attempted to make the analysis of the behavior 
of the user who has a mental model more specific and revealing 
(Foley and Williges, 1982; Moran, 1983; Payne and Green, 1983). 
What is missing from these analyses, however, is how users use 
their mental models to come up with a set of appropriate actions. 
There are likely to be some very interesting cognitive actions going 
on in the pause between the presentation of the problem (e.g., the 
feedback from the screen after an error) and the choice of the next 
action. 

Most of the empirical work on the effectiveness of mental 
models and the predictive power of sequence/method analyses has 
been at a gross behavioral level. The studies of experts’ chunking 
of information (Chase and Simon, 1973, for example) are almost 
completely empirical; they focus entirely on the acquisition of the 
condition part of a condition-action pair and offer little basis for 
theory. The grammatical approaches often hold a key assumption: 
that the fewer actions there are per task, the cognitively simpler 
the task. Recent work has raised questions about the accuracy of 
this assumption (Olson and Nilsen, 1988; Rosson, 1983). There 
are occasions when a task has a few actions, but the planning and 
calculating necessary to make those actions is difficult. 

Moran has described a number of connections and contrasts 
between sequence/method and mental model approaches. The 
GOMS analysis (a methods analysis) and CLG (a blend of method 
and mental model) sprang from common theoretical roots. Indeed,
GOMS can be viewed as a simplified and more parameterized, 
compiled CLG. Moran (1981), however, stresses two contrasts. 
First, where CLG incorporates a limited mental model of the 
system in its semantic level, GOMS incorporates no mental model 
whatsoever. GOMS incorporates only the knowledge required to 
perform a task. Second, where the focus of CLG is the functional 
description of various levels of user knowledge and the mappings 
between these levels, the focus of GOMS is the sequencing of 
operators and the time requirements for each. The bottom line for 
GOMS is predicting performance times. 
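
To illustrate what predicting performance times from a sequence of operators involves, the sketch below sums per-operator times for two alternative methods. It is a simplified illustration, not the GOMS analysis itself: the operator codes and the times are placeholder values of the kind used in keystroke-level analyses, and the two methods are invented.

```python
# Placeholder operator times in seconds (assumed values, not from this report):
# K = keystroke, P = point with a device, H = move hand between devices,
# M = mental preparation.
OPERATOR_TIMES = {"K": 0.2, "P": 1.1, "H": 0.4, "M": 1.35}

# Two hypothetical methods for the same goal, written as operator sequences.
METHODS = {
    "keyboard command": "MKKKK",
    "pointing device": "MHPK",
}


def predicted_time(method, times=OPERATOR_TIMES):
    """Sum the assumed time of each operator in the method."""
    return sum(times[op] for op in method)


def fastest(methods, times=OPERATOR_TIMES):
    """Name of the method with the smallest predicted time."""
    return min(methods, key=lambda name: predicted_time(methods[name], times))


if __name__ == "__main__":
    for name, ops in METHODS.items():
        print(f"{name}: {predicted_time(ops):.2f} s")
    print("fastest:", fastest(METHODS))
```

The sketch predicts only execution time for an error-free method; it says nothing about how a user chooses among methods or recovers from mistakes.
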

Kieras and Polson (1983) simulate users’ behavior on particular
computer systems. They have two representations in their
simulations, which with an additional twist can be viewed in much
the same spirit as Moran’s (1981) view of the relation between
CLG and GOMS. In the Kieras and Polson (1983) model, a job-task
representation describes the person’s understanding of when
and how to carry out tasks (very much like GOMS). The simulated 
user’s behavior is responded to by a simulation of the system, a 
device representation, which is a GTN of the states and transi- 
tions between them in the system. Some knowledge of this sys- 
tem behavior, a mental GTN, can represent what the user knows 
about the system — a thin, surrogate mental model. The former 
GOMS-like representation is the user’s knowledge that produces 
performance, while the latter, the mental GTN, could be the user’s 
theory during learning, problem solving, and explaining how the 
system works. 


HOW USERS’ KNOWLEDGE AFFECTS 
THEIR PERFORMANCE 


The discussion up to this point has treated what the user 
knows as a static structure. While we have alluded to its un- 
derlying role in behavior (learning, problem solving, explanation, 
skill), we have not focused on these behavioral processes per se. 
Nevertheless, this aspect is critical both to assessing the empirical 
content of current analyses and to determining how these analy- 
ses might be applied to practical problems like the design of user 
interfaces and training materials. 


Chaos and Misconception in Both Novices and Experts 


Learning involves internalizing, constructing, or otherwise
obtaining a representation of the system being learned. How does
this process proceed and what are its early results? The summary
picture is of a halting and often somewhat nonconvergent
process of problem solving and invention (e.g., Bott, 1979; Mack et
al., 1983; Rumelhart and Norman, 1981). Indeed, the models that
learners spontaneously form are incomplete, inconsistent, unstable 
in time, overly simple, and often rife with superstition. 

A person may develop an understanding that is adequate for
simple cases but that does not extend to more complex cases. For
example, Mayer and Bayman (1981) found that users of calculators
often believed that evaluation only occurs when the equals key is
pressed. Scandura et al. (1976) describe a student who concluded
that the equals key and plus keys on a calculator had no function
because they caused no visible change in the display. Norman
(1983) describes learners who superstitiously pressed the clear key
on calculators several times, when a single key press would do.
People learning to use a simple programmable robot developed
wrong analogical models of its behavior that they accepted without
testing until the models failed to predict the actions the robot
took (Shrager and Klahr, 1983). Mantei (1982) found that users
performing a task in a menu-based retrieval system developed and
maintained simplistic sequences of actions that were eventually
ineffective in accomplishing their search goals.

Chaotic and misconceived conceptual models are not merely
an issue of early learning and something that users outgrow.
Experienced users hold them as well. For example, Mayer and Bayman
(1981) asked students to predict the outcomes of key press
sequences on a calculator. Even though all of the students were
experienced in the use of calculators, their predictions varied
considerably. For example, some predicted that an evaluation occurs
immediately after a number key is pressed, some predicted that
evaluation occurs immediately after an operation (e.g., plus) key
is pressed, and some predicted that an evaluation occurs
immediately after equals is pressed.
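
The three beliefs about when evaluation occurs lead to different predicted displays for the same key sequence, which a short simulation can make visible. The Python below is our own illustration, not Mayer and Bayman's procedure: the function names, the restriction to single-digit numbers with plus and minus, and the left-to-right evaluation are all assumptions.

```python
def evaluate(tokens):
    """Evaluate [number, op, number, ...] left to right; a trailing operator is ignored."""
    total = tokens[0]
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        total = total + operand if op == "+" else total - operand
    return total


def predicted_displays(keys, evaluate_on):
    """Display predicted after each key under one belief about evaluation.

    evaluate_on is "number", "operator", or "equals" (hypothetical labels for
    the three kinds of prediction described in the text).
    """
    tokens, display, shown = [], 0, []
    for key in keys:
        if key == "=":
            display = evaluate(tokens)          # every model evaluates on equals
        elif key in ("+", "-"):
            if evaluate_on == "operator":
                display = evaluate(tokens)      # running total appears on operator
            tokens.append(key)
        else:
            number = int(key)
            tokens.append(number)
            # Under the "number" belief the total appears as soon as it is typed.
            display = evaluate(tokens) if evaluate_on == "number" else number
        shown.append(display)
    return shown


if __name__ == "__main__":
    keys = ["2", "+", "3", "+", "4", "="]
    for model in ("number", "operator", "equals"):
        print(f"{model:>8}: {predicted_displays(keys, model)}")
```

All three beliefs agree on the final display, so only the intermediate displays distinguish them, which is why asking users to predict step by step is informative.
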

Rosson (1983) found that even experienced users of a text 
editing system often had rather limited command repertoires, rou- 
tinely employing nonoptimal methods (such as making repeated 
local changes instead of a single global change). Even in large,
powerful systems, most of the activity involves the use of only a very
small portion of the system. In the case of UNIX, for example, 20 
of the available 400 commands accounted for about 70 percent of 
the usage (Kraut et al., 1983). Like the Mayer and Bayman work 
(1981), this suggests that even an extensive amount of experience 
does not necessarily lead the user to a complete, consistent, or 
even correct conceptual model. There are some things about a 
system that most users never learn. 


Skilled Performance 

Human performance analyses have been well developed in ve- 
hicular control (e.g., aircraft, ship, automobile) and target pursuit 
tasks. Many of these analyses explicitly hypothesize a mental 
model of the system being operated (e.g., Baron and Levison, 
1980; Jagacinski and Miller, 1978; Pew and Baron, 1983; Veld- 
huyzen and Stassen, 1976). In these cases, the mental model is 
used to anticipate the response of a dynamic system and hence to 
overcome the deleterious effects of time delays either from other 
humans or hardware. These models have produced good descrip- 
tions and predictions of human performance. 

Because these models deal with spatio-temporal trajectories, 
their applicability is limited to continuous detection and movement 
tasks. In contrast, episodic models of movement that incorporate 
an additional, abstract level of description in terms of discrete 
situation-action pairs have much in common with goal-action mod- 
els in human-computer interaction. Discrete representational and 
data reduction techniques developed for episodic skilled perfor- 
mance (Jagacinski et al., in press; Miller, 1985) may prove useful 
in the domain of human-computer interaction. Software user tasks 
do, however, typically involve a larger set of situation-action pairs 
than is covered in human performance analyses, and they probably
involve more varied categorization and planning by the human
operator. Whether these techniques can be generalized to the greater
cognitive complexity of human-computer interaction tasks is an open
question. 

If we assume that knowledge of simple sequences is in the 
form of goal-action pairs, then we should be able to apply what we 
know from traditional verbal learning studies about the retention 
of paired associates (e.g., Hilgard and Bower, 1975; Postman and
Stark, 1969) to predict which systems will be easy to learn and
what kinds of errors will occur. For example, presumably, those 
systems that have few paired associates to be learned or those that 
have distinct, nonconfusable goal-action pairs will be easy to learn 
and remember. 

Landauer et al. (1983), Barnard et al. (1981), and others have 
explored certain aspects of this issue with mixed results. Lan- 
dauer et al. (1984) discuss the difficulties of constructing command 
names that are natural, that is, those that would have existing 
goal-action paired associates in memory and ready to transfer eas- 
ily to a new task. They argue that if one incorporates command 
names generated by naive users, these names are natural but often 
are not distinctive enough to keep users from confusing them
with one another. Preexisting paired associates can help
transfer, but if they are not distinct paired associates as a set (e.g., 
A-B may be good until it must be learned along with A-C), the 
confusion can offset any positive effect from their naturalness. 

Polson and Kieras (1984, 1985) embody the GOMS model in a
production system-based simulation of users’ behavior while using
software. This is a very concrete representation of what the user
knows when performing well-learned tasks and has a number of
confirmed behavioral correlates. Their analyses postulated that
the number of productions (the number of rules needed to
decompose goals into subgoals, to find methods to fit the subgoals, and to
execute the sequence of actions in a method) necessary to perform
a task predicts the time it takes to learn a system,
that the number of productions that two systems have in common
predicts the ease of learning the second after the first, that the
number of productions used in constructing the next overt action
predicts the delay from one overt action to the next, and that the
number of items held temporarily in working memory predicts
the likelihood of errors or delays (Kieras and Bovair, 1985; Kieras
and Polson, 1985; Polson and Kieras, 1984, 1985; Polson et al.,
1986). Some of the predictions afforded by this specific analysis
have been successfully tested; others are being tested now.
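
The transfer prediction, that productions shared between two systems ease learning of the second, reduces to a set comparison once each system's productions are listed. The sketch below assumes productions can be identified by simple labels; the labels, the two editors, and the summary measure are hypothetical and do not reproduce the Kieras and Polson simulation.

```python
# Each system is described by the set of productions needed to use it.
# The labels are invented; a real analysis would list condition-action rules.
EDITOR_A = {"delete-word", "insert-text", "move-cursor", "save-file"}
EDITOR_B = {"delete-word", "insert-text", "move-cursor", "search-text", "replace-text"}


def transfer_summary(first, second):
    """Count productions carried over from the first system and those still to learn."""
    shared = first & second
    new = second - first
    return {
        "shared": len(shared),
        "new": len(new),
        "fraction_shared": len(shared) / len(second),
    }


if __name__ == "__main__":
    summary = transfer_summary(EDITOR_A, EDITOR_B)
    print(f"shared productions: {summary['shared']}")
    print(f"new productions to learn: {summary['new']}")
    print(f"fraction of second system already known: {summary['fraction_shared']:.2f}")
```

The result depends entirely on how the productions were written in the first place, so two analysts using different rule styles could obtain different counts for the same pair of systems.
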

Though this approach is to be lauded for its specificity and 
the accuracy of some of its predictions, its weakness lies in de- 
termining how one counts the number of productions required for 
a task. Since production rule formalisms are general program- 
ming languages, a single function can be programmed in many 
ways. Consequently, for purposes of replicability, it is important 
for Kieras and Polson (1985) to specify further what production
language style underlies these production analyses, and further, 
whether this style can be argued to be consonant with a the- 
oretically reasonable model of the architecture in which human 
information processing operates. 

The chief limitation of the GOMS analysis is that it considers 
only error-free performance. This is a serious limitation since even 
skilled users spend at least a quarter of their time making and 
recovering from errors. In GOMS, goals are very specific to task 
situations; they are not currently in a general form (Card et al., 
1983). This is not a limitation in principle, however, since GOMS 
is a deliberate simplification of Newell and Simon’s (1972) general 
problem solver, a model that is general enough to describe any 
goal-directed behavior. Robertson (1983) suggests how error and 
error recovery could be incorporated into a GOMS-like analysis. 

Rumelhart and Norman (1982) present a performance analysis 
of skilled typing that takes the description of errors as a primary 
concern. The treatment of errors in their analysis raises an im- 
portant issue. In order to describe the occurrence of some kinds 
of errors, they were forced to change the assumption of how in- 
formation is stored in memory. The analysis was fundamentally 
altered in order to qualitatively predict the typical errors for the 
task. This raises the question of whether GOMS, in which only 
error-free behavior was analyzed, embodies a representation that 
can be generalized to real performance that includes errors. 


APPLYING WHAT WE KNOW OF THE USER’S 
KNOWLEDGE TO PRACTICAL PROBLEMS 

The foregoing discussions have reviewed various representa- 
tions of the user’s knowledge of a system. We have described 
them in terms of the theoretical representations posited and some 
of the cognitive processes included in each type of analysis. It 
seems safe to conclude that while the area of research on users’ 
mental representations is very active, it is not yet well developed 
(see the Research Recommendations section below). Nevertheless, 
software human factors is an applied area, and there is continual 
pressure to apply what we do know in this work to the task of 
design and training. 

Applying what we know about mental representations to practical
ends raises many questions. For example, if we knew what
the user knew, how would we use this knowledge in design? Do
we build the user interface to reflect a consistent mental model?
If so, what do the input and presentation look like? Should we
tell the learner what model to build?


Designing Interfaces 

If the interface suggests or reflects an appropriate model, then 
the user could conceivably learn it with less guidance and perform 
it with fewer errors. The question is: What should the model be? 

One approach to picking a model is to design user interfaces 
to accord with naive user conceptual models (Carroll and Thomas, 
1982; Mayer and Bayman, 1981). Although this approach is simple 
and straightforward, its general utility is open to question. For 
example, Wright and Bason (1982) designed two software packages 
for a casual user population. One package was designed to be 
maximally consistent with the users’ prior knowledge; users were
asked how they thought about their data and what they wanted 
to be able to do with it, and this formed the basis for the user 
interface. The second package was also designed with input from 
potential users, but in this case, the designer used this information 
to determine how the users ought to think about their data and 
operations on it. The finding was that, in every way, the second 
package was a better design. 

In a similar vein, Landauer et al. (1983) replaced the verbs 
in a word processor’s command names (like append, substitute, 
and delete) with those that secretaries generated most often when 
describing to another secretary how to change a marked up manu- 
script (such as add, change, and omit). Paired associate learning 
theory would have predicted that these well-learned goal-action 
pairs from the secretaries’ own vocabulary would have been good 
command names for secretaries learning a new word processor. 
The goal-action pairs are presumably preexisting paired associates,
ones not needing new learning. Learning the word processor with
these command names, however, was no better than learning the 
one with the system developers’ names or even one with random 
names like allege, cypher, and deliberate. Naive users do not 
necessarily design better systems. 

A variant of the naive model approach is to enter into the 
design process with a preconceived model, and then to iteratively 
build a prototype, test it, and refine the design (including the user’s 
model) until acceptable usability is attained. This technique is the
classic empirical approach (Dreyfus, 1955); it has been employed 
in recent designs that use the desktop metaphor in the interface for 
office systems (Bewley et al., 1983; Morgan et al., 1983), as well 
as in other application system designs (Gould and Boies, 1983; 
Wixon et al., 1983). The theoretical problem with this approach 
is that in the context of iterative and often radical redesign of a 
user interface, it is difficult to clearly separate the effect of the 
model on usability from that of other aspects of the redesign.⁴

A second design approach is to reduce the problem of commu- 
nicating an appropriate conceptual model to the user by simpli- 
fying the system and its interface. DuBoulay et al. (1981) stress 
this in their characterization of a glass box model that consists 
of only a small number of components and interactions, all ob- 
viously reflected in the feedback that learners get from running 
the system. Carroll and Carrithers (1984) implemented this ap- 
proach by providing new users with only a small but sufficient 
subset of commands to learn. This small set fits a relatively sim- 
ple conceptual model. Carroll and Carrithers (1984) called this the 
“training wheels” approach, borrowing the analogy from learning 
to ride a bicycle. Once the subset of commands was learned, the 
user was gradually introduced to more complicated or more rarely 
used commands. This approach led to faster and more successful 
learning. An important question raised in this work, however, is 
how to decide which subset of commands is sufficient to do the 
task and fits a simple model. Furthermore, it raises the question 
of how to embellish the initially simplified conceptual model so 
that the change does not disrupt the learning the user has already 
accomplished. 

A third design approach focuses on the method that the user 
learns rather than on the mental model. Moran (1981), Reisner 
(1981, 1984), and Young (1981) all stress the potentie’ utility 
of task-oriented knowledge for design. Such knowledge can be 
represented formally. The suggestion is that these representations 
can be examined or manipulated prior to actual construction of 
the user interface to determine the least complex organization for 


^However, Olson st al. (1984) highlight the importance of running 
prototype tests with two prototypes that differ in only one variable at a 
time, so that the effects of individual design changes can be measured 
independently. 


the interface. For example, the designer can calculate values of 
merit for a system based on the number of rules in a grammar, 
the number of different terminal symbols, or various other metrics 
known in computational linguistics.This approach could also mak e 
it possible to define precisely concepts like consistency: similar 
tasks or goals should be associated with similar or identical actions 
(Moran, 1983). For example, deleting a sentence ought to have 
similar actions to deleting a paragraph. Empirical work has shown 
the importance of such concepts (e.g., Barnard et al., 1981; Black
and Sebrechts, 1981; Thomas and Carroll, 1981; but see Landauer
et al., 1984, for a caveat). 
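
As a rough illustration of the kind of calculation described above, the sketch below counts rules and terminal symbols in a toy command grammar and computes a crude consistency score for two similar tasks. The grammar, the symbol names, and the overlap measure are all assumptions made for this example; they are not taken from Moran's or Reisner's formalisms.

```python
# Toy command grammar in a BNF-like form. Nonterminals are the dict keys;
# every other symbol on a right-hand side is treated as a terminal.
# The grammar itself is hypothetical, invented for illustration.
GRAMMAR = {
    "delete_sentence": [["select_sentence", "DEL_KEY"]],
    "delete_paragraph": [["select_paragraph", "DEL_KEY"]],
    "select_sentence": [["POINT", "CLICK", "CLICK"]],
    "select_paragraph": [["POINT", "CLICK", "CLICK", "CLICK"]],
}


def count_rules(grammar):
    """Number of production rules (one per alternative right-hand side)."""
    return sum(len(alternatives) for alternatives in grammar.values())


def terminal_symbols(grammar):
    """Set of distinct terminal symbols used anywhere in the grammar."""
    symbols = {s for alts in grammar.values() for rhs in alts for s in rhs}
    return symbols - set(grammar)


def expand(grammar, symbol):
    """Expand a symbol into its terminal action sequence.

    Assumes each nonterminal has exactly one alternative and no recursion,
    which holds for the toy grammar above.
    """
    if symbol not in grammar:
        return [symbol]
    actions = []
    for part in grammar[symbol][0]:
        actions.extend(expand(grammar, part))
    return actions


def action_overlap(grammar, task_a, task_b):
    """Crude consistency score: shared terminals over all terminals used."""
    a = set(expand(grammar, task_a))
    b = set(expand(grammar, task_b))
    return len(a & b) / len(a | b)
```

In this toy grammar, deleting a sentence and deleting a paragraph use the same kinds of actions and differ only in the number of clicks, so the overlap score treats them as fully consistent. A measure that compared the sequences position by position would be stricter.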

It should be noted that analysis of these relations, like consis- 
tency, may not go very far toward describing the interface design 
fully. For example, two interfaces with exactly the same grammat- 
ical description of a command language may have very different 
visual layouts. The visual layouts may lead to performance dif- 
ferences not predicted by a calculated complexity measure that 
is based only on inconsistencies in the command language. With 
the exception of Dunsmore (1986), most grammars do not rep- 
resent features of visual layout that are known to be important. 
Dunsmore (1986) predicted and then experimentally verified that a
crowded display would be more difficult for users to deal with than
an uncrowded one. Furthermore, with the exception of Shneider- 
man (1982), whose multiparty grammars can be used to describe 
both a user’s action and the system’s response, there has been little 
attempt to integrate models of the various components of a sys- 
tem. Moreover, optimizing a design with respect to a task-oriented 
analysis will not necessarily include any of the design considera- 
tions that would be indicated by optimizing the presentation of a 
good mental model. 


User Training

If a system has been built to conform to a consistent model or a 
well-formed set of methods, training may simply involve presenting 
the user with the model or methods. Several researchers have been 
concerned with developing techniques for providing users with 
appropriate conceptual models, something that even state-of-the- 
art instructional materials for software often fail to do (Bayman 
and Mayer, 1984; DuBoulay et al., 1981; Halasz and Moran, 1982). 
The benefits from presenting a mental model, however, are unclear. 
Schlager and Ogden (1986), for example, incorporated both a
method representation and a mental model in the training mate- 
rials for teaching students how to form successful queries in a data 
base. For those specific query types that fit the model or meth- 
ods presented, both representations speeded learning, regardless 
of whether the representation was a method representation or a 
mental model. Errors and difficulty occurred only when queries 
were different from the method or model taught. 

Mayer (1976, 1981) provided students with a diagrammatic
tool which incorporated a variety of concrete metaphors (e.g.,
input as a ticket window and storage as a file cabinet). Students 
who were exposed to this tool before studying a training manual 
were later able to perform better on both programming and recall 
tasks. 

Kieras and Bovair (1984) taught people how a simple device 
worked either by a rote sequence of steps, with a model of the 
system, or with an analogy. The sequence of steps showed them 
what to do when. One model displayed what part was connected 
to another part beneath the surface, as if a flow diagram were 
painted on the control panel. The analogy described the control 
panel as being part of a mock spaceship, explaining what each
control knob did in terms of battle-related actions. The results
showed no benefit from either of the models over the rote sequence.
On closer inspection, Kieras and Polson (1985) noted that neither
of the models gave the user any action-oriented help; the models 
merely gave a story about what the connections were, not how 
they worked. 

Halasz and Moran (1982) taught students how to use a calculator
using either a step-by-step action sequence to do standard
calculations or instructions which included a verbal model
of how the internal registers, windows, and stacks worked. They 
found that performance on standard tasks was identical for the 
two groups, but that the group who learned the model performed 
better on novel tasks. 

Foss et al. (1982) provided a file folder metaphor to students 
learning to use a text editor. They found that students who were 
provided with the metaphor learned more in less time. In the 
same domain, Rumelhart and Norman (1981) used a composite 
of three metaphors: a secretary metaphor, which was used to 
explain that commands can be interspersed with text input; a 
card file metaphor, which was used to describe the deletion of a 
single numbered line from a file; and a tape recorder metaphor,
which was used to convey the need for explicit terminators in files. 
Although performance was good overall, the fact that there were 
several metaphors produced cases in which a subject would employ 
one of the metaphors when another was appropriate. 

Most of this work has focused on the use of mental models 
narrowly in training, namely, by telling the student the model or 
by providing simple and explicit advance organizers (Ausubel,
1960). In another approach, an explicit mental model was prescribed;
a system’s training manual had a diagrammatic model of
control flow for a menu-based system (Galambos et al., 1985). The
resultant benefits were equivocal, however. Even greater integra- 
tion between model and training appears necessary. The feasibility 
of this approach is exemplified in systems that have mental model 
analyses in their expert systems to interactively diagnose learner 
problems and to provide tailored support (e.g., Burton, 1981). No 
systematic behavioral studies have been carried out, however, to 
evaluate the effectiveness of this approach. 

A more theoretical issue in the area of training pertains fun- 
damentally to the nature of learning and the implications for 
designing training programs. One view of human learning and 
memory conceives of learning as an active process of problem solv- 
ing in which concepts are created by the learner (e.g., Jenkins, 
1974; Wittrock, 1974). This view contrasts with one in which 
learning is merely the storage of concepts in memory. In the latter 
view, a learner can be given a conceptual model explicitly (by 
diagram or a verbal explanation). In the active learning view, 
however, a conceptual model must be invented by the learner after 
an appropriate series of experiences. 

Mayer (1980) adopted the active learner view. He asked learners
to generate a metaphorical elaboration of programming statement
types as they were learned. For example, after learning a
FOR statement, the student was asked to describe its function 
using a metaphorical desktop vocabulary. He found that learners 
who had provided these elaborations were later able to perform 
better on novel and complex programming problems. 

Carroll and Mack (1985) suggested that taking a serious active 
learning view raises the possibility that metaphors are useful not 
only when they provide familiar descriptions of novel experiences, 
but also when they provoke thought by failing to accord perfectly 
with the target of the metaphor comparison. Carroll and Mack 
(1985) described a learner who was trying to learn a desktop

Yet a specification of how a person would use a mental model
to predict what a system will do is not sufficient to predict the
user’s behavior. Our understanding of mental models (if they ex- 
ist) needs to be embedded in a model of a full-blown cognitive 
system, one that has problem-solving and decision-making pro- 
cesses that are sufficient to initiate the model runs, collect the 
results, and decide on an external action. 

2. Investigate whether people have and use mental models of 
various kinds. Probably the most basic question in this area, still
far from being answered, is whether people construct and use 
mental models at all. And, because of confusion of terminology in 
the literature, behavioral evidence is not clearly supportive. Even 
when we confine ourselves to the specific definition of mental mod- 
els used in this report, however, there is little evidence that people 
have and use mental models. So far, the majority of evidence for 
mental models has come from people’s self-reports that they form 
and use them (which may be post-hoc rationalizations), and from 
some evidence that when taught a system model or analogy, per- 
formance is sometimes better and learning may be faster. Specific 
research is needed to demonstrate whether people have models and 
whether their behavior is clearly distinguishable from that produced
by having stored sequence/method representations. 

3. Determine the behaviors that would demonstrate the model’s 
form and the operations used on it. If a person has a mental model,
there may be some observable behavior that would give an analyst 
evidence of its form. Traditionally, experimental psychologists 
have made inferences about the existence of mental events by 
carefully constructing test situations with systematically varied 
features and observing particular overt responses such as the time 
that it takes to make a certain judgment or carry out an action, 
or the amount and kinds of errors made. The construction of the 
appropriate comparative test situations and the inferences that 
can be drawn from the responses, times, and errors must be based 
on a clearer notion of the form that the model might have and the 
processes that may act on it. 

If the analyst can predict behavior knowing that the person 
has a mental model of a particular sort, then the analyst should be 
able to discover the mental models of other people from systematic 
examination of their behavior. Multidimensional scaling (Shepard 
et al., 1972), unfolding theory (Coombs, 1964), and ordered tree 


analysis (Reitman and Rueter, 1980) are examples of techniques 
that allow the analyst to infer particular mental representations 
from behavior. Perhaps aspects of behavior can reveal the form of 
a working mental model. This work could follow from a program 
of research that built on the theoretical work outlined above. 

4. Explore alternative views of sequence/method representations
and the behavior predicted from them. We currently have
a better conception of what it means to have sequence/method 
representations and what processes may act on them to produce 
behavior than we do of mental models. GOMS represents the 
structure of goals, methods, and actions in a mental hierarchy 
for well-learned cognitive tasks. Kieras and Polson’s (1985) production
system formalism and its inference engine (a standard set
of procedures for keeping track of where one is in a process and 
choosing the subsequent actions) is a concrete specification of this 
kind of knowledge and the processes that act on it. From that for- 
malism follow concrete predictions of behavior, such as particular 
responses (key presses), their times, and the errors. A body of em- 
pirical data is growing, answering questions about which aspects 
of the representation affect behavior. 
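
To make the idea of a production system and its inference engine concrete, the sketch below runs a small rule set against a working memory and records the key presses it predicts. It is a hedged illustration only: the rule format, the task (deleting a word in a hypothetical editor), and the key names are assumptions, and the notation is not the one used by Kieras and Polson.

```python
# Each rule has a condition ("if"), working-memory updates ("then"),
# and an optional overt action ("press"). All names are hypothetical.
RULES = [
    {"name": "move-to-word",
     "if": {"goal": "delete-word", "step": "start"},
     "then": {"step": "at-word"},
     "press": "MOVE-CURSOR"},
    {"name": "issue-delete",
     "if": {"goal": "delete-word", "step": "at-word"},
     "then": {"step": "deleted"},
     "press": "DELETE-WORD-KEY"},
    {"name": "finish",
     "if": {"goal": "delete-word", "step": "deleted"},
     "then": {"goal": "done"},
     "press": None},
]


def matches(rule, memory):
    """A rule matches when every condition element equals working memory."""
    return all(memory.get(key) == value for key, value in rule["if"].items())


def run(rules, memory, max_cycles=20):
    """Recognize-act loop: fire the first matching rule until none match.

    Returns the predicted key presses and the number of cycles used.
    """
    memory = dict(memory)  # work on a copy so the caller's state is untouched
    presses = []
    cycles = 0
    while cycles < max_cycles:
        fired = next((rule for rule in rules if matches(rule, memory)), None)
        if fired is None:
            break
        memory.update(fired["then"])
        if fired["press"] is not None:
            presses.append(fired["press"])
        cycles += 1
    return presses, cycles
```

Starting from `{"goal": "delete-word", "step": "start"}`, the loop fires three rules and predicts two key presses. In a fuller model, quantities such as the number of cycles or the number of rules to be learned might be related to execution time or learning time, but that mapping is an assumption here and would need empirical support.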

What is needed is more research in this vein. Formalisms of 
knowledge and operational mechanisms would be specified and the 
behavior of other kinds of sequence/method representations would 
be predicted. Empirical studies could then be formed to answer 
specific questions about the adequacy of the formalism, in detail, 
replacing the vague generalizations and contradictions that seem 
to plague research in this area today. 

5. Explore the types of mental representations that may ex- 
ist that are not mechanistic. Most of the mental models that 
are conceived in this research are mechanistic in nature. The se- 
quence/method representations are mechanistic and serial. These 
consist of components and processes that mimic physical devices. 
There may be mental representations of other types, however, 
that drive people’s exploratory and explanatory behavior. People 
claim to make inferences and explorations from stored visual and 
auditory images; mathematicians experiment mentally with com- 
putational systems, making inferences before showing any exter- 
nal behavior; people likely reason at different levels of abstraction 
about a system, making inferences of a very general nature in 
planning before exploring details in a step-by-step fashion. There 
may be visual, auditory, computational, or hierarchical systems
that form helpful bases for people’s reasoning. These other possi- 
ble types of mental representations should be made concrete, and 
their behavioral correlates should be explored. 

6. Determine how people intermix different representations in 
producing behavior. This report has reviewed a variety of types 
of knowledge that may be held by a user of a computer system. 
It is likely that users have some knowledge stored in several of 
these representations: some well-known procedures for executing 
simple sequences; some well-formed GOMS-like structures for doing
familiar but more complicated tasks; and some mental models
that help the user explore alternative actions to take when an error 
occurs or when a novel task is presented to them. If all of these rep- 
resentations exist simultaneously, then we need to know when each 
is used and how the person moves between them and/or combines 
their operations or products. There is likely to be some problem- 
solving or decision-making apparatus that guides the overall task 
behavior, sometimes exploring unknown territory with a process 
like means-ends analysis or running a mental model, and other 
times executing well-learned actions from stored goal structures 
(see, for example, the extensive literature on automaticity; Shiffrin 
and Schneider, 1977). An integrated performance view is called for.


7. Explore how knowledge about systems is acquired. If we 
can discover the form of the representation of knowledge that 
people have about computer systems, we would like also to know 
how that information was acquired. Lewis’s work (1986) on how 
people make inferences about a system from watching its behavior 
is a good example of how to specify concretely how people learn 
complicated tasks on computers. Work is also needed on how 
people acquire mental models, simple sequences, and methods. 
This work would have an impact not only on the design of systems
and their training, but also would give some basic knowledge about 
the problem of learning complex behavior in general. 


8. Determine how individual differences have an impact on
learning of and performance on systems. Individuals’ cognitive
capacities differ, making different computer users more and less
capable. Some of these differences are likely to arise from simply
having more knowledge from longer exposure to the system. Exposure
could provide a user with more task knowledge as well as more
specific and more accurate mental models. Some of the differences
in performance, however, may arise from basic individual differ- 
ences in abilities. For example, Gomez et al. (1983) have shown 
that people who are not good at visual memory have difficulty
with some word processors. Further, they found that a system 
that required less recall of a command syntax reduced the perfor- 
mance differences found between those who could recall locations 
and those who could not. We need to know more about individ- 
ual cognitive differences and their concomitant effect on people’s 
mental representations of and performance on complex tasks. The 
results will have implications for both the design of systems and 
the construction of training sequences for a particular system for 
particular users. 

9. Explore the design of training sequences for systems. A 
related training issue surrounds the idea of “training wheels,”
the notion that a scaled-down system is easier to learn initially. 
Specifying and analyzing the mental model or sequence/method
representations implied by the scaled-down system may lead de- 
signers to build more coherent systems and more effective training 
sequences. Further, this analysis may indicate how information 
about the full system should be taught as an add-on to the train- 
ing wheels system. 

10. Provide system designers with tools to help them develop 
interfaces that invoke good representations in users. There is probably
some guidance that can be provided to systems designers while
they design the user interface to ensure that the sequence/method 
representation or the mental model will be an effective guide to 
accurate performance. Such tools may come in the form of user interface
management systems, which constrain the design set. The
goal may be to constrain the ways that the designers can display 
things or constrain the ways that they can allow the user to invoke 
a command so that a coherent, easily understood set is formed, 
one that invokes in the user a good mental model or a coherent 
set of goal-action pairs. Designing these guidance tools is an important
research topic, one that can aid the transfer of technology
from the laboratory to the design and development arena. 

11. Expand the task domain to more complex software. Most
of the research in the area of mental models and sequence/method 
representations for human-computer interaction has focused on 
text processing and simple device models. Whatever results 
emerge from these areas should be tested for their applicability
to more complex, nonexclusively text-based tasks, such as
graph[illegible], decision making, project planning and tracking,
and data base query. It is likely that the complexity of these tasks,
in which the user is almost never doing a task that is well-learned,
requires the user to use mental models and to try out actions
never used before. These may be ideal domains in which to test
[illegible] of mental models, the productive interaction
of sequence/method representations and mental models, and the
[illegible] of problem-solving skills, reasoning, and deci-